NETWORK PERFORMANCE MONITORING 
IN A CONTENT DELIVERY SERVICE 

RELATED APPLICATIONS 
[0001] This application is a continuation of U.S. application Serial No. 09/620,658,
filed July 20, 2000 and entitled "Network Performance Monitoring in a Content Delivery
Service". This application is related to U.S. Patent No. 6,108,703, issued August 22, 2000, filed
May 19, 1999, which application was based on provisional application Serial No. 60/092,710,
filed July 14, 1998.

BACKGROUND OF THE INVENTION 

Technical Field 

[0002] This invention relates generally to information retrieval in a computer network. 
More particularly, the invention relates to a novel method of hosting and distributing content
on the Internet that addresses the problems of Internet Service Providers (ISPs) and Internet Content 
Providers. 

Description of the Related Art 

[0003] The World Wide Web is the Internet's multimedia information retrieval system. In
the Web environment, client machines effect transactions to Web servers using the Hypertext Transfer
Protocol (HTTP), which is a known application protocol providing users access to files (e.g., text,
graphics, images, sound, video, etc.) using a standard page description language known as Hypertext
Markup Language (HTML). HTML provides basic document formatting and allows the developer to
specify "links" to other servers and files. In the Internet paradigm, a network path to a server is
identified by a so-called Uniform Resource Locator (URL) having a special syntax for defining a
network connection. Use of an HTML-compatible browser (e.g., Netscape Navigator or Microsoft
Internet Explorer) at a client machine involves specification of a link via the URL. In response, the
client makes a request to the server identified in the link and, in return, receives a document or
other object formatted according to HTML. A collection of documents supported on a Web server is
sometimes referred to as a Web site.

[0004] It is well known in the prior art for a Web site to mirror its content at another server.
Indeed, at present, the only method for a Content Provider to place its content closer to its
readers is to build copies of its Web site on machines that are located at Web hosting farms in different
locations domestically and internationally. These copies of Web sites are known as mirror sites.
Unfortunately, mirror sites place unnecessary economic and operational burdens on Content
Providers, and they do not offer economies of scale. Economically, the overall cost to a Content Provider
with one primary site and one mirror site is more than twice the cost of a single primary site. This
additional cost is the result of two factors: (1) the Content Provider must contract with a separate
hosting facility for each mirror site, and (2) the Content Provider must incur additional overhead expenses
associated with keeping the mirror sites synchronized.

[0005] In an effort to address problems associated with mirroring, companies such as 
Cisco, Resonate, Bright Tiger, F5 Labs and Alteon, are developing software and hardware that 
will help keep mirror sites synchronized and load balanced. Although these mechanisms are 
helpful to the Content Provider, they fail to address the underlying problem of scalability. Even if a
Content Provider is willing to incur the costs associated with mirroring, the technology itself will not
scale beyond a few (i.e., fewer than 10) Web sites.

[0006] In addition to these economic and scalability issues, mirroring also entails operational
difficulties. A Content Provider that uses a mirror site must not only lease and manage physical space in
distant locations, but it must also buy and maintain the software or hardware that synchronizes and load
balances the sites. Current solutions require Content Providers to supply personnel, technology and
other items necessary to maintain multiple Web sites. In summary, mirroring requires Content
Providers to waste economic and other resources on functions that are not relevant to their core
business of creating content.

[0007] Moreover, Content Providers also desire to retain control of their content. 
Today, some ISPs are installing caching hardware that interrupts the link between the Content 
Provider and the end-user. The effect of such caching can produce devastating results to the 
Content Provider, including (1) preventing the Content Provider from obtaining accurate hit counts 
on its Web pages (thereby decreasing revenue from advertisers), (2) preventing the Content Provider
from tailoring content and advertising to specific audiences (which severely limits the effectiveness
of the Content Provider's Web page), and (3) providing outdated information to its customers (which can lead
to a frustrated and angry end user).

[0008] There remains a significant need in the art to provide a decentralized hosting 
solution that enables users to obtain Internet content on a more efficient basis (i.e., without
burdening network resources unnecessarily) and that likewise enables the Content Provider to
maintain control over its content. The present invention solves these and other problems associated with
the prior art.

BRIEF SUMMARY OF THE INVENTION

[0009] The present invention provides a computer network comprising a large number of 
widely deployed Internet servers that form an organic, massively fault-tolerant infrastructure
designed to serve Web content efficiently, effectively, and reliably to end users. 

[0010] Another aspect of the present invention is to provide a fundamentally new and better
method to distribute Web-based content. The inventive architecture provides a method
for intelligently routing and replicating content over a large network of distributed servers,
preferably with no centralized control.

[0011] Another aspect of the present invention is to provide a network architecture that moves
content close to the user. The inventive architecture allows Web sites to develop large
audiences without worrying about building a massive infrastructure to handle the associated traffic.

[0012] Still another aspect of the present invention is to provide a fault-tolerant network for
distributing Web content. The network architecture is used to speed up the delivery of richer
Web pages, and it allows Content Providers with large audiences to serve them reliably and economically,
preferably from servers located close to end users.

[0013] A further aspect of the present invention is the ability to distribute and manage 
content over a large network without disrupting the Content Provider's direct relationship with the 
end user. 

[0014] Yet another aspect of the present invention is to provide a distributed scalable infrastructure
for the Internet that shifts the burden of Web content distribution from the Content Provider to a network of
preferably hundreds of hosting servers deployed, for example, on a global basis.

[0015] In general, the present invention is a network architecture that supports hosting on a
truly global scale. The inventive framework allows a Content Provider to replicate its most popular content at
an unlimited number of points throughout the world. As an additional feature, the actual content that is
replicated at any one geographic location is specifically tailored to viewers in that location.
Moreover, content is automatically sent to the location where it is requested, without any effort or overhead
on the part of a Content Provider.

[0016] The present invention provides a global hosting framework to enable Content
Providers to retain control of their content.

[0017] The hosting framework of the present invention comprises a set of servers
operating in a distributed manner. The actual content to be served is preferably supported on a set
of hosting servers (sometimes referred to as ghost servers). This content comprises HTML page
objects that, conventionally, are served from a Content Provider site. In accordance with the
invention, however, a base HTML document portion of a Web page may be served from the Content
Provider's site while one or more embedded objects for the page are served from the hosting servers,
preferably, those hosting servers nearest the client machine. By serving the base HTML document from
the Content Provider's site, the Content Provider maintains control over the content.
[0018] The determination of which hosting server to use to serve a given embedded 
object is effected by other resources in the hosting framework. In particular, the framework 
includes a second set of servers (or server resources) that are configured to provide top level 
Domain Name Service (DNS). In addition, the framework also includes a third set of servers (or server
resources) that are configured to provide low level DNS functionality. When a client machine issues an
HTTP request to the Web site for a given Web page, the base HTML document is served from the
Web site as previously noted. Embedded objects for the page preferably are served from particular
hosting servers identified by the top- and low-level DNS servers. To locate the appropriate hosting
servers to use, the top-level DNS server determines the user's location in the network to identify a
given low-level DNS server to respond to the request for the embedded object. The top-level DNS
server then redirects the request to the identified low-level DNS server that, in turn, resolves the
request into an IP address for the given hosting server that serves the object back to the client.
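
The two-step lookup described in this paragraph can be illustrated with a minimal Python sketch. It is not the claimed implementation: the region names, the address blocks (taken from documentation-only ranges), the server addresses, and the function names are all assumptions chosen for illustration, and the checksum-based choice of a hosting server merely stands in for whatever selection policy a low-level DNS would actually apply.

```python
import ipaddress
import zlib

# Hypothetical top-level DNS view: client address block -> region.
REGION_BY_BLOCK = {
    ipaddress.ip_network("192.0.2.0/24"): "boston",
    ipaddress.ip_network("198.51.100.0/24"): "atlanta",
}

# Hypothetical low-level DNS view: region -> hosting (ghost) server addresses.
GHOSTS_BY_REGION = {
    "boston": ["203.0.113.10", "203.0.113.11"],
    "atlanta": ["203.0.113.20"],
}


def top_level_resolve(client_ip, default_region="atlanta"):
    """Top level: pick the region (and thus the low-level DNS) for a client."""
    addr = ipaddress.ip_address(client_ip)
    for block, region in REGION_BY_BLOCK.items():
        if addr in block:
            return region
    return default_region


def low_level_resolve(region, hostname):
    """Low level: pick one hosting server in the region for the hostname."""
    ghosts = GHOSTS_BY_REGION[region]
    # A stable checksum keeps a given hostname on the same server across calls.
    return ghosts[zlib.crc32(hostname.encode("ascii")) % len(ghosts)]


def resolve(client_ip, hostname):
    """Chain the two levels: region first, then a hosting server in it."""
    return low_level_resolve(top_level_resolve(client_ip), hostname)
```

Under these assumptions, a call such as `resolve("192.0.2.55", "a1234.g.akamaitech.net")` returns one of the two addresses listed for the hypothetical "boston" region.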

[0019] More generally, it is possible (and, in some cases, desirable) to have a hierarchy of
DNS servers that consists of several levels. The lower one moves in the hierarchy, the closer one gets
to the best region.

[0020] A further aspect of the invention is a means by which content can be distributed and
replicated through a collection of servers so that the use of memory is optimized subject to the
constraints that there are a sufficient number of copies of any object to satisfy the demand, the copies of
objects are spread so that no server becomes overloaded, copies tend to be located on the same servers as
time moves forward, and copies are located in regions close to the clients that are requesting them. Thus,
servers operating within the framework need not keep copies of all of the content database.
Rather, given servers keep copies of a minimal amount of data so that the entire system provides
the required level of service. This aspect of the invention allows the hosting scheme to be far more
efficient than schemes that cache everything everywhere, or that cache objects only in
prespecified locations.

[0021] The global hosting framework is fault tolerant at each level of operation. In 
particular, the top level DNS server returns a list of low-level DNS servers that may be used by 
the client to service the request for the embedded object. Likewise, each hosting server preferably
includes a buddy server that is used to assume the hosting responsibilities of its associated 
hosting server in the event of a failure condition. 
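
As a rough illustration of this fault-tolerance behavior, the Python sketch below walks a list of candidate servers and falls back to a candidate's buddy when the candidate is down. The function name, the `buddy_of` mapping, and the `is_alive` health check are hypothetical; the text does not specify how liveness is detected or how a buddy takes over.

```python
def choose_server(candidates, buddy_of, is_alive):
    """Return the first usable server, trying each candidate and then its buddy.

    candidates: ordered list of server names (e.g., as returned by a lookup).
    buddy_of:   dict mapping a server to its buddy server (may be incomplete).
    is_alive:   callable returning True if the named server can serve requests.
    """
    for server in candidates:
        if is_alive(server):
            return server
        buddy = buddy_of.get(server)
        if buddy is not None and is_alive(buddy):
            return buddy
    return None  # no candidate and no buddy is currently available
```

For example, with `is_alive` backed by a set of healthy servers, a failed first candidate is transparently replaced by its buddy before the next candidate in the list is considered.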

[0022] The foregoing has outlined some of the more pertinent objects and features of the 
present invention. These aspects should be construed to be merely illustrative of some of the 
more prominent features and applications of the invention. Many other beneficial results can be
attained by applying the disclosed invention in a different manner or modifying the invention as will be
described. Accordingly, other aspects and a fuller understanding of the invention may be had by referring to
the following Detailed Description of the Preferred Embodiment.


03142:00011 : DALLAS : 1254038.1



BRIEF DESCRIPTION OF THE DRAWINGS 

[0023] For a more complete understanding of the present invention and the advantages thereof,
reference should be made to the following Detailed Description taken in connection with the 
accompanying drawings in which: 

Figure 1 is a representative system in which the present invention is implemented; 

Figure 2 is a simplified representation of a markup language document illustrating the base 
document and a set of embedded objects; 

Figure 3 is a high level diagram of a global hosting system according to the present 
invention; 

Figure 4 is a simplified flowchart illustrating a method of processing a Web page to
modify embedded object URLs that is used in the present invention;

Figure 5 is a simplified state diagram illustrating how the present invention responds to a 
HTTP request for a Web page; 

Figure 6 is a block diagram of a preferred embodiment of a content delivery service 
in which the present invention may be implemented; 

Figure 7 is a simplified block diagram illustrating how a content provider site operates
with the content delivery service; 

Figure 7A illustrates how the DNS system resolves an end user request for an ARL;

Figure 8 is a bipartite graph used for high level DNS map generation; and 

Figure 9 is a bipartite graph used for fast map generation. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0024] A known Internet client-server system is implemented as illustrated in Figure 1. A
client machine 10 is connected to a Web server 12 via a network 14. For illustrative purposes,
network 14 is the Internet, an intranet, an extranet or any other known network. Web server 12
is one of a plurality of servers which are accessible by clients, one of which is illustrated by
machine 10. A representative client machine includes a browser 16, which is a known software
tool used to access the servers of the network. The Web server supports files (collectively referred to
as a "Web" site) in the form of hypertext documents and objects. In the Internet paradigm, a network path to a
server is identified by a so-called Uniform Resource Locator (URL).

[0025] A representative Web server 12 is a computer comprising a processor 18, an 
operating system 20, and a Web server program 22, such as Netscape Enterprise Server. The 
server 12 also includes a display supporting a graphical user interface (GUI) for management and
administration, and an Application Programming Interface (API) that provides extensions to enable
application developers to extend and/or customize the core functionality thereof through software programs
including Common Gateway Interface (CGI) programs, plug-ins, servlets, active server pages, server
side include (SSI) functions or the like.

[0026] A representative Web client is a personal computer that is x86-, PowerPC®- or 
RISC-based, that includes an operating system such as IBM® OS/2® or Microsoft Windows 95,
and that includes a Web browser, such as Netscape Navigator 4.0 (or higher), having a Java
Virtual Machine (JVM) and support for application plug-ins or helper applications. A client may
also be a notebook computer, a handheld computing device (e.g., a PDA), an Internet appliance,
or any other such device connectable to the computer network. 

[0027] As seen in Figure 2, a typical Web page comprises a markup language (e.g. 
HTML) master or base document 28, and many embedded objects (e.g., images, audio, video, or 
the like) 30. Thus, in a typical page, twenty or more embedded images or objects are quite 
common. Each of these images is an independent object in the Web, retrieved (or validated for
change) separately. The common behavior of a Web client, therefore, is to fetch the base HTML
document, and then immediately fetch the embedded objects, which are typically (but not always)
located on the same server. According to the present invention, preferably the markup language base
document 28 is served from the Web server (i.e., the Content Provider site) whereas a given number
(or perhaps all) of the embedded objects are served from other servers. As will be seen, preferably a
given embedded object is served from a server (other than the Web server itself) that is close to 
the client machine, that is not overloaded, and that is most likely to already have a current version 
of the required file. 

[0028] Referring now to Figure 3, this operation is achieved by the hosting system of the 
present invention. As will be seen, the hosting system 35 comprises a set of widely-deployed 
servers (or server resources) that form a large, fault-tolerant infrastructure designed to serve Web
content efficiently, effectively, and reliably to end users. The servers may be deployed globally, or 
across any desired geographic regions. As will be seen, the hosting system provides a distributed 
architecture for intelligently routing and replicating such content. To this end, the global hosting 
system 35 comprises three (3) basic types of servers (or server resources): hosting servers 
(sometimes called ghosts) 36, top-level DNS servers 38, and low-level DNS servers 40. Although 
not illustrated, there may be additional levels in the DNS hierarchy. Alternatively, there may be 
a single DNS level that combines the functionality of the top level and low-level servers. In this
illustrative embodiment, the inventive framework 35 is deployed by an Internet Service Provider (ISP),
although this is not a limitation of the present invention. The ISP or ISPs that deploy the inventive global
hosting framework 35 preferably have a large number of machines that run both the ghost server
component 36 and the low-level DNS component 40 on their networks. These machines are
distributed throughout the network; preferably, they are concentrated around network exchange
points 42 and network access points 44, although this is not a requirement. In addition, the ISP
preferably has a small number of machines running the top-level DNS 38 that may also be 
distributed throughout the network. 

[0029] Although not meant to be limiting, preferably a given server used in the 
framework 35 includes a processor, an operating system (e.g., Linux, UNIX, Windows NT, or 
the like), a Web server application, and a set of application routines used by the invention. These 
routines are conveniently implemented in software as a set of instructions executed by the
processor to perform various process or method steps as will be described in more detail
below. The servers are preferably located at the edges of the network (e.g., in points of
presence, or POPs).

[0030] Several factors may determine where the hosting servers are placed in the network.
Thus, for example, the server locations are preferably determined by a demand driven network map that
allows the provider (e.g., the ISP) to monitor traffic requests. By studying traffic patterns, the ISP 
may optimize the server locations for the given traffic profiles. 

[0031] According to the present invention, a given Web page (comprising a base HTML 
document and a set of embedded objects) is served in a distributed manner. Thus, preferably, the 
base HTML document is served from the Content Provider that normally hosts the page. The 
embedded objects, or some subset thereof, are preferentially served from the hosting servers 36 
and, specifically, given hosting servers 36 that are near the client machine that in the first 
instance initiated the request for the Web page. In addition, preferably loads across the hosting 
servers are balanced to ensure that a given embedded object may be efficiently served from a
given hosting server near the client when such client requires that object to complete the page.

[0032] To serve the page contents in this manner, the URL associated with an
embedded object is modified. As is well-known, each embedded object that may be served in a
page has its own URL. Typically, the URL has a hostname identifying the Content Provider's
site from where the object is conventionally served, i.e., without reference to the present
invention. According to the invention, the embedded object URL is first modified, preferably in an
off-line process, to condition the URL to be served by the global hosting servers. A flowchart
illustrating the preferred method for modifying the object URL is illustrated in Figure 4.

[0033] The routine begins at step 50 by determining whether all of the embedded objects in a
given page have been processed. If so, the routine ends. If not, however, the routine gets the
next embedded object at step 52. At step 54, a virtual server hostname is prepended into the URL
for the given embedded object. The virtual server hostname includes a value (e.g., a number)
that is generated, for example, by applying a given hash function to the URL. As is well-known, a
hash function takes arbitrary length bit strings as inputs and produces fixed length bit strings (hash
values) as outputs. Such functions satisfy two conditions: (1) it is infeasible to find two different inputs
that produce the same hash value, and (2) given an input and its hash value, it is infeasible to find a
different input with the same hash value. In step 54, the URL for the embedded object is hashed
into a value xx,xxx that is then included in the virtual server hostname. This step randomly
distributes the object to a given virtual server hostname.
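
Step 54 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the hash actually used: SHA-256 is chosen only because it is readily available, the five-digit range is inferred from the "xx,xxx" placeholder, and the `ghost` prefix and `ghosting.example` domain are hypothetical names.

```python
import hashlib

NUM_VIRTUAL_SERVERS = 100_000  # assumed range for the five-digit value xx,xxx


def virtual_server_number(url: str) -> int:
    """Hash an embedded-object URL into a number in [0, NUM_VIRTUAL_SERVERS)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_VIRTUAL_SERVERS


def virtual_hostname(url: str, domain: str = "ghosting.example") -> str:
    """Build a virtual server hostname that embeds the hashed value."""
    return f"ghost{virtual_server_number(url)}.{domain}"
```

Because the hash is deterministic, the same object URL always maps to the same virtual server hostname, while different URLs spread roughly evenly over the available names.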

[0034] The present invention is not limited to generating the virtual server hostname by
applying a hash function as described above. As an alternative and preferred embodiment, a virtual server
hostname is generated as follows. Consider the representative hostname a1234.g.akamaitech.net.
The 1234 value, sometimes referred to as a serial number, preferably includes information about
the object such as its size (big or small), its anticipated popularity, the date on which the object
was created, the identity of the Web site, the type of object (e.g., movie or static picture), and
perhaps some random bits generated by a given random function. Of course, it is not required that
any given serial number encode all of such information or even a significant number of such
components. Indeed, in the simplest case, the serial number may be a simple integer. In any
event, the information is encoded into a serial number in any convenient manner. Thus, for
example, a first bit is used to denote size, a second bit is used to denote popularity, a set of
additional bits is used to denote the date, and so forth. As noted above in the hashing example,
the serial number is also used for load balancing and for directing certain types of traffic to certain
types of servers. Typically, most URLs on the same page have the same serial number to
minimize the number of distinguished name (DN) accesses needed per page. This requirement is
less important for larger objects.
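
One possible bit layout for such a serial number is sketched below in Python. The field widths, the reference date, and the function names are assumptions made for illustration; the text only says that one bit may denote size, another popularity, and a set of further bits the date.

```python
from datetime import date

EPOCH = date(1998, 1, 1)  # hypothetical reference date for the date field
DATE_BITS = 14            # hypothetical width: days since EPOCH


def encode_serial(is_big: bool, is_popular: bool, created: date) -> int:
    """Pack size (1 bit), popularity (1 bit), and creation date into an integer."""
    days = (created - EPOCH).days
    if not 0 <= days < (1 << DATE_BITS):
        raise ValueError("creation date does not fit in the date field")
    return (int(is_big) << (DATE_BITS + 1)) | (int(is_popular) << DATE_BITS) | days


def decode_serial(serial: int):
    """Unpack the fields written by encode_serial."""
    is_big = bool((serial >> (DATE_BITS + 1)) & 1)
    is_popular = bool((serial >> DATE_BITS) & 1)
    days = serial & ((1 << DATE_BITS) - 1)
    return is_big, is_popular, date.fromordinal(EPOCH.toordinal() + days)
```

A resolver that understands this layout could, for instance, inspect the size bit to steer large objects toward servers provisioned for them, which is one way the serial number might support directing certain types of traffic to certain types of servers.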

[0035] Thus, according to the present invention, a virtual server hostname is prepended 
into the URL for a given embedded object, and this hostname includes a value (or serial number) 
that is generated by applying a given function to the URL or object. That function may be a hash
function, an encoding function, or the like. 

[0036] Turning now back to the flowchart, the routine then continues at step 56 to include a
given value in the object's URL. Preferably, the given value is generated by applying a given
hash function to the embedded object. This step creates a unique fingerprint of the object that is useful
for determining whether the object has been modified. Thereafter, the routine returns to step 50 and
cycles.

[0037] With the above as background, the inventive global hosting framework is now described
in the context of a specific example. In particular, it is assumed that a user of a client machine in Boston
requests a Content Provider Web page normally hosted in Atlanta. For illustrative purposes, it is
assumed that the Content Provider is using the global hosting architecture within a network, which
may be global, international, national, regional, local or private. Figure 5 shows the various components of
the system and how the request from the client is processed. This operation is not to be taken by way
of limitation, as will be explained.

[0038] Step 1: The browser sends a request to the Provider's Web site (Item 1). The 
Content Provider site in Atlanta receives the request in the same way that it would if the global hosting
framework were not being implemented. The difference is in what is returned by the Provider site.
Instead of returning the usual page, according to the invention, the Web site returns a page with
embedded object URLs that are modified according to the method illustrated in the flowchart of Figure
4. As previously described, the URLs preferably are changed as follows:

[0039] Assume that there are 100,000 virtual ghost servers, even though there may only 
be a relatively small number (e.g., 100) physically present on the network. These virtual ghost 
servers or virtual ghosts are identified by the hostname: ghostxxxxx.ghosting.com, where xxxxx is replaced
by a number between 0 and 99,999. After the Content Provider Web site is updated with new information, a
script executing on the Content Provider site is run that rewrites the embedded URLs. Preferably, the
embedded URL names are hashed into numbers between 0 and 99,999, although this range is not a
limitation of the present invention. An embedded URL is then switched to reference the virtual ghost
with that number. For example, the following is an embedded URL from the Provider's site:
<IMG SRC=http://www.provider.com/TECH/images/space.story.gif>

[0040] If the serial number for the object referred to by this URL is the number 1467, then
preferably the URL is rewritten to read:
<IMG SRC=http://ghost1467.ghosting.akamai.com/www.provider.com/TECH/images/space.story.gif>.

[0041] The use of serial numbers in this manner distributes the embedded URLs roughly 
evenly over the 100,000 virtual ghost server names. Note that the Provider site can still 
personalize the page by rearranging the various objects on the screen according to 
individual preferences. Moreover, the Provider can also insert advertisements dynamically 
and count how many people view each ad. 
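
The rewriting script described in this example might look like the following Python sketch. It is an assumption-laden illustration: the regular expression handles only `<IMG SRC=http://...>` tags of the form shown in the text, the CRC-based `serial_for` function is a stand-in for whatever serial-number scheme is actually used, and a production script would use a real HTML parser.

```python
import re
import zlib

NUM_VIRTUAL_GHOSTS = 100_000          # 0 through 99,999, as in the example
GHOST_DOMAIN = "ghosting.akamai.com"  # domain as written in the example URL

# Matches <IMG SRC=http://host/path> with an optional opening double quote.
IMG_SRC = re.compile(r'(<IMG\s+SRC\s*=\s*"?)http://([^\s>"]+)', re.IGNORECASE)


def serial_for(url_without_scheme: str) -> int:
    """Hypothetical serial number: a checksum of the URL reduced to the range."""
    return zlib.crc32(url_without_scheme.encode("utf-8")) % NUM_VIRTUAL_GHOSTS


def rewrite_embedded_urls(html: str, serial=serial_for) -> str:
    """Prepend a virtual ghost hostname to every embedded image URL."""
    def repl(match):
        original = match.group(2)  # e.g. www.provider.com/TECH/images/x.gif
        ghost = f"ghost{serial(original)}.{GHOST_DOMAIN}"
        return f"{match.group(1)}http://{ghost}/{original}"
    return IMG_SRC.sub(repl, html)
```

Passing a custom `serial` callable makes it possible to plug in a different numbering scheme without touching the rewriting logic; only the embedded object URLs change, so the base page can still be personalized by the Provider site.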

[0042] According to the preferred embodiment, an additional modification to the 
embedded URLs is made to ensure that the global hosting system does not serve stale information. As
previously described, preferably a hash of the data contained in the embedded URL is also inserted into
the embedded URL itself. That is, each embedded URL may contain a fingerprint of the data to which
it points. When the underlying information changes, so does the fingerprint, and this prevents users from
referencing old data.

[0043] The second hash takes as input a stream of bits and outputs what is sometimes referred to
as a fingerprint of the stream. The important property of the fingerprint is that two different
streams almost surely produce two different fingerprints. Examples of such hashes are the MD2
and MD5 hash functions; however, other more transparent methods such as a simple checksum may
be used. For concreteness, assume that the output of the hash is a 128 bit signature. This signature
can be interpreted as a number and then inserted into the embedded URL. For example, if the
hash of the data in the picture space.story.gif from the Provider web site is the number 28765, then the
modified embedded URL would actually look as follows:
<IMG SRC=http://ghost1467.ghosting.akamai.com/28765/www.provider.com/TECH/images/space.story.gif>.

[0044] Whenever a page is changed, preferably the hash for each embedded URL is 
recomputed and the URL is rewritten if necessary. If any of the URL's data changes, for example, a new
and different picture is inserted with the name space.story.gif, then the hash of the data is
different and therefore the URL itself will be different. This scheme prevents the system from serving data
that is stale as a result of updates to the original page.

[0045] For example, assume that the picture space.story.gif is replaced with a more up-to-date
version on the Content Provider server. Because the data of the picture changes, the hash of the data
changes as well. Thus, the new embedded URL looks the same except that a new number is inserted
for the fingerprint. Any user that requests the page after the update receives a page that points to the new
picture. The old picture is never referenced and cannot be mistakenly returned in place of the 
more up-to-date information. 
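
The fingerprinting step can be illustrated with the short Python sketch below, which uses MD5 (one of the hashes named above) and interprets its 128-bit output as an integer. The function names and the exact position of the fingerprint in the URL path follow the example in the text but are otherwise assumptions.

```python
import hashlib


def fingerprint(data: bytes) -> int:
    """Interpret the 128-bit MD5 digest of the object data as an integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def fingerprinted_url(ghost_host: str, original: str, data: bytes) -> str:
    """Insert the data fingerprint between the ghost hostname and original URL."""
    return f"http://{ghost_host}/{fingerprint(data)}/{original}"


# Replacing the picture changes its bytes, hence its fingerprint, hence its URL.
old = fingerprinted_url("ghost1467.ghosting.akamai.com",
                        "www.provider.com/TECH/images/space.story.gif",
                        b"old picture bytes")
new = fingerprinted_url("ghost1467.ghosting.akamai.com",
                        "www.provider.com/TECH/images/space.story.gif",
                        b"new picture bytes")
```

Here `old` and `new` differ only in the fingerprint component, so a copy cached under the old URL is simply never requested again once the page has been rewritten.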

[0046] In summary, preferably there are two hashing operations that are done to modify 
the pages of the Content Provider. First, hashing can be a component of the process by which a 
serial number is selected to transform the domain name into a virtual ghost name. As will be 
seen, this first transformation serves to redirect clients to the global hosting system to retrieve 
the embedded URLs. Next, a hash of the data pointed to by the embedded URLs is computed and 
inserted into the URL. This second transformation serves to protect against serving stale and
out-of-date content from the ghost servers. Preferably, these two transformations are performed
off-line and therefore do not pose potential performance bottlenecks.

[0047] Generalizing, the preferred URL schema is as follows. The illustrative URL
www.domainname.com/frontpage.jpg is transformed into:

xxxx.yy.zzzz.net/aaaa/www.domainname.com/frontpage.jpg, where

xxxx = serial number field
yy = lower level DNS field
zzzz = top level DNS field
aaaa = other information (e.g., fingerprint) field.

[0048] If additional levels of the DNS hierarchy are used, then there may be additional
lower level DNS fields, e.g., xxxx.y1y1.y2y2.zzzz.net/aaaa/. . .
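
To make the schema concrete, the following Python sketch splits a transformed URL into the fields listed above. The `GhostUrl` type and `parse_ghost_url` function are hypothetical names, and the parser assumes the scheme prefix has already been removed and that the hostname ends in a single-label top-level domain such as `.net`.

```python
from typing import List, NamedTuple


class GhostUrl(NamedTuple):
    serial: str               # xxxx: serial number field
    low_level_dns: List[str]  # yy (or y1y1, y2y2, ...): lower level DNS fields
    top_level_dns: str        # zzzz: top level DNS field
    other_info: str           # aaaa: other information (e.g., fingerprint)
    original: str             # the Content Provider's original URL


def parse_ghost_url(url: str) -> GhostUrl:
    """Split 'xxxx.yy.zzzz.net/aaaa/original...' into its schema fields."""
    host, other_info, original = url.split("/", 2)
    labels = host.split(".")
    if len(labels) < 3:
        raise ValueError("hostname needs serial, top level DNS, and TLD labels")
    # Everything between the serial number and the top level field is treated
    # as lower level DNS fields, which allows zero, one, or several levels.
    serial, *low_level, top_level, _tld = labels
    return GhostUrl(serial, low_level, top_level, other_info, original)
```

Applied to `xxxx.yy.zzzz.net/aaaa/www.domainname.com/frontpage.jpg`, the parser yields `xxxx` as the serial number, `["yy"]` as the lower level DNS fields, `zzzz` as the top level DNS field, and `aaaa` as the other-information field.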

[0049] Step 2: After receiving the initial page from the Content Provider site, the browser
needs to load the embedded URLs to display the page. The first step in doing this is to contact
the DNS server on the user's machine (or at the user's ISP) to resolve the altered hostname, in this case:
ghost1467.ghosting.akamai.com. As will be seen, the global hosting architecture of the present
invention manipulates the DNS system so that the name is resolved to one of the ghosts that is near the
client and is likely to have the page already. To appreciate how this is done, the following describes the
progress of the DNS query that was initiated by the client.

[0050] Step 3: As previously described, preferably there are two types of DNS servers in 
the inventive system: top-level and low-level. The top level DNS servers 38 for ghosting.com
have a special function that is different from regular DNS servers like those of the .com domain. The
top level DNS servers 38 include appropriate control routines that are used to determine where in
the network a user is located, and then to direct the user to an akamai.com (i.e., a low level DNS) server
40 that is close-by. Like the .com domain, akamai.com preferably has a number of top-level
DNS servers 38 spread throughout the network for fault tolerance. Thus, a given top level DNS
server 38 directs the user to a region in the Internet (having a collection of hosting servers 36 that
may be used to satisfy the request for a given embedded object) whereas the low level DNS server 40
(within the identified region) identifies a particular hosting server within that collection from which the
object is actually served.

[0051] More generally, as noted above, the DNS process can contain several levels of
processing, each of which serves to better direct the client to a ghost server. The ghost server name
can also have more fields. For example, "a123.g.g.akamaitech.net" may be used instead of
"a123.ghost.akamai.com." If only one DNS level is used, a representative URL could be
"a123.akamai.com."

[0052] Although other techniques may be used, the user's location in the network preferably is
deduced by looking at the IP address of the client machine making the request. In the present
example, the DNS server is running on the machine of the user, although this is not a requirement.
If the user is using an ISP DNS server, for example, the routines make the assumption that the user is
located near (in the Internet sense) this server. Alternatively, the user's location or IP address could be
directly encoded into the request sent to the top level DNS. To determine the physical location of an IP
address in the network, preferably, the top level DNS server builds a network map that is then used
to identify the relevant location.

[0053] Thus, for example, when a request comes in to a top level DNS for a resolution
for a1234.g.akamaitech.net, the top level DNS looks at the return address of the requester and then
formulates the response based on that address according to a network map. In this example, the
a1234 is a serial number, the g is a field that refers to the lower level DNS, and akamaitech refers
to the top level DNS. The network map preferably contains a list of all Internet Protocol (IP)
blocks and, for each IP block, the map determines where to direct the request. The map preferably
is updated continually based on network conditions and traffic.
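
The field layout in this example can be illustrated with a short sketch. It is not part of the described system: the type `GhostName` and the function `parse_ghost_name` are hypothetical names, and the sketch assumes the hostname always ends in a top level DNS field followed by a single suffix such as "net".

```python
from typing import NamedTuple, Tuple


class GhostName(NamedTuple):
    serial: str                 # serial number field, e.g. "a1234"
    low_level: Tuple[str, ...]  # zero or more lower level DNS fields, e.g. ("g",)
    top_level: str              # top level DNS field, e.g. "akamaitech"
    suffix: str                 # trailing domain suffix, e.g. "net"


def parse_ghost_name(hostname: str) -> GhostName:
    """Split a ghost hostname into the fields named in the text (assumed layout)."""
    labels = hostname.rstrip(".").lower().split(".")
    if len(labels) < 3:
        raise ValueError(f"expected at least serial.top.suffix, got {hostname!r}")
    return GhostName(labels[0], tuple(labels[1:-2]), labels[-2], labels[-1])
```

Under these assumptions, "a1234.g.akamaitech.net" yields one lower level field, while a single-level name such as "a123.akamai.com" yields none.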

[0054] After determining where in the network the request originated, the top level DNS 
server redirects the DNS request to a low level DNS server close to the user in the network. The 
ability to redirect requests is a standard feature in the DNS system. In addition, this redirection 
can be done in such a way that if the local low level DNS server is down, there is a backup server 
that is contacted. 

[0055] Preferably, the TTL (time to live) stamp on these top level DNS redirections for
the ghosting.com domain is set to be long. This allows DNS caching at the user's DNS servers
and/or the ISP's DNS servers to prevent the top level DNS servers from being overloaded. If the
TTL for ghosting.akamai.com in the DNS server at the user's machine or ISP has expired, then a
top level server is contacted, and a new redirection to a local low level ghosting.akamai.com DNS
server is returned with a new TTL stamp. It should be noted that the system does not cause a
substantially larger number of top level DNS lookups than what is done in the current centralized
hosting solutions. This is because the TTL of the top level redirections is set to be high and,
thus, the vast majority of users are directed by their local DNS straight to a nearby low level
ghosting.akamai.com DNS server.

[0056] Moreover, fault tolerance for the top level DNS servers is provided automatically 
by DNS similarly to what is done for the popular .com domain. Fault tolerance for the low level 
DNS servers preferably is provided by returning a list of possible low level DNS servers instead
of just a single server. If one of the low level DNS servers is down, the user will still be able to 
contact one on the list that is up and running. 

[0057] Fault tolerance can also be handled via an "overflow control" mechanism 
wherein the client is redirected to a low-level DNS in a region that is known to have sufficient 
capacity to serve the object. This alternate approach is very useful in scenarios where there is a 
large amount of demand from a specific region or when there is reduced capacity in a region. In
general, the clients are directed to regions in a way that minimizes the overall latency experienced
by clients subject to the constraint that no region becomes overloaded. Minimizing overall latency 
subject to the regional capacity constraints preferably is achieved using a min-cost 
multicommodity flow algorithm. 

[0058] Step 4: At this point, the user has the address of a close-by ghosting.com DNS
server 38. The user's local DNS server contacts the close-by low level DNS server 40 and
requests a translation for the name ghost1467.ghosting.akamai.com. The local DNS server is
responsible for returning the IP address of one of the ghost servers 36 on the network that is close 
to the user, not overloaded, and most likely to already have the required data. 

[0059] The basic mechanism for mapping the virtual ghost names to real ghosts is 
hashing. One preferred technique is so-called consistent hashing, as described in U.S. Patent No. 
6,430,618, issued August 6, 2002, and in U.S. Patent No. 6,553,420, issued April 22, 2003, each 
titled Method And Apparatus For Distributing Requests Among A Plurality Of Resources, and 
owned by the Massachusetts Institute of Technology, which patents are incorporated herein by reference.
Consistent hash functions make the system robust under machine failures and crashes. It also allows the
system to grow gracefully, without changing where most items are located and without perfect 
information about the system. 
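
The robustness property described above can be illustrated with a minimal hash ring. This is a generic sketch of consistent hashing, not the method of the incorporated patents: the class `HashRing`, the helper `_point`, the use of SHA-256, and the `replicas` parameter are all assumptions made for illustration.

```python
import bisect
import hashlib


def _point(key: str) -> int:
    """Map a string to a stable 64-bit point on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Each server owns several points; a name maps to the next point clockwise."""

    def __init__(self, servers, replicas=64):
        self._ring = sorted(
            (_point(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, name: str) -> str:
        # Wrap around to the first point when the name hashes past the last one.
        index = bisect.bisect_right(self._points, _point(name)) % len(self._ring)
        return self._ring[index][1]
```

When one server is removed from such a ring, only the names that previously mapped to that server move; every other name keeps its server, which is the sense in which most items stay where they are located.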

[0060] According to the invention, the virtual ghost names may be hashed into real ghost
addresses using a table lookup, where the table is continually updated based on network
conditions and traffic in such a way as to ensure load balancing and fault tolerance. Preferably, a
table of resolutions is created for each serial number. For example, serial number 1 resolves to
ghosts 2 and 5, serial number 2 resolves to ghost 3, serial number 3 resolves to ghosts 2, 3, 4, and
so forth. The goal is to define the resolutions so that no ghost exceeds its capacity and that the
total number of all ghosts in all resolutions is minimized. This is done to assure that the system
can take maximal advantage of the available memory at each region. This is a major advantage over
existing load balancing schemes that tend to cache everything everywhere or that only cache certain objects in
certain locations no matter what the loads are. In general, it is desirable to make assignments so that
resolutions tend to stay consistent over time provided that the loads do not change too much in a
short period of time. This mechanism preferably also takes into account how close the ghost is to the user,
and how heavily loaded the ghost is at the moment.

[0061] Note that the same virtual ghost preferably is translated to different real ghost 
addresses according to where the user is located in the network. For example, assume that ghost 
server 18.98.0.17 is located in the United States and that ghost server 132.68.1.28 is located in 
Israel. A DNS request for ghost1487.ghosting.akamai.com originating in Boston will resolve to
18.98.0.17, while a request originating in Tel-Aviv will resolve to 132.68.1.28.

[0062] The low-level DNS servers monitor the various ghost servers to take into account
their loads while translating virtual ghost names into real addresses. This is handled by a software
routine that runs on the ghosts and on the low level DNS servers. In one embodiment, the
load information is circulated among the servers in a region so that they can compute
resolutions for each serial number. One algorithm for computing resolutions works as follows. The
server first computes the projected load (based on number of user requests) for each serial number. The
serial numbers are then processed in increasing order of load. For each serial number, a random
priority list of desired servers is assigned using a consistent hashing method. Each serial
number is then resolved to the smallest initial segment of servers from the priority list so that no server
becomes overloaded. For example, if the priority list for a serial number is 2,5,3,1,6, then an attempt is made
first to try to map the load for the serial number to ghost 2. If this overloads ghost 2, then the load
is assigned to both ghosts 2 and 5. If this produces too much load on either of those servers, then
the load is assigned to ghosts 2, 3, and 5, and so forth. The projected load on a server can be
computed by looking at all resolutions that contain that server and by adding the amount of load
that is likely to be sent to that server from that serial number. This method of producing
resolutions is most effective when used in an iterative fashion, wherein the assignment starts in
a default state, where every serial number is mapped to every ghost. By refining the resolution
table according to the previous procedure, the load is balanced using the minimum amount of
replication (thereby maximally conserving the available memory in a region).
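
A single pass of the resolution procedure described above might be sketched as follows. The function names, the even split of a serial number's load across its assigned ghosts, and the hash-based ordering used to build a priority list are assumptions made for illustration; the text does not specify them.

```python
import hashlib


def priority_list(serial, servers):
    """Stable pseudo-random ordering of servers for one serial number (assumed scheme)."""
    return sorted(
        servers, key=lambda s: hashlib.sha256(f"{serial}:{s}".encode()).hexdigest()
    )


def compute_resolutions(loads, capacity, priorities=None):
    """Resolve each serial number to the shortest prefix of its priority list
    that keeps every chosen server within capacity.

    loads: {serial: projected load}; capacity: {server: maximum load}.
    priorities: optional {serial: ordered list of servers}.
    """
    used = {server: 0.0 for server in capacity}
    resolutions = {}
    for serial in sorted(loads, key=loads.get):      # increasing order of load
        order = (priorities or {}).get(serial) or priority_list(serial, list(capacity))
        for k in range(1, len(order) + 1):
            share = loads[serial] / k                # assumption: even split
            if all(used[s] + share <= capacity[s] for s in order[:k]):
                break
        # If no prefix fits, k is the full list: a best-effort assignment.
        for server in order[:k]:
            used[server] += share
        resolutions[serial] = order[:k]
    return resolutions
```

With the priority list 2,5,3 and a load that exceeds the capacity of ghost 2 alone, the sketch assigns the serial number to ghosts 2 and 5, matching the example in the text.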

[0063] The TTL for these low level DNS translations is set to be short to allow a quick
response when heavy load is detected on one of the ghosts. The TTL is a parameter that can be
manipulated by the system to ensure a balance between timely response to high load on ghosts
and the load induced on the low level DNS servers. Note, however, that even if the TTL for the
low level DNS translation is set to 1-2 minutes, only a few of the users actually have to do a low
level DNS lookup. Most users will see a DNS translation that is cached on their machine or at
their ISP. Thus, most users go directly from their local DNS server to the close-by ghost that has
the data they want. Those users that actually do a low level DNS lookup have a very small added
latency; however, this latency is small compared to the advantage of retrieving most of the data from
close by.

[0064] As noted above, fault tolerance for the low level DNS servers is provided by having the
top level DNS return a list of possible low level DNS servers instead of a single server address.
The user's DNS system caches this list (part of the standard DNS system), and contacts one of the other
servers on the list if the first one is down for some reason. The low level DNS servers make use of a standard
feature of DNS to provide an extra level of fault tolerance for the ghost servers. When a name is
translated, instead of returning a single name, a list of names is returned. If for some reason the primary
fault tolerance method for the ghosts (known as the Buddy system, which is described below) fails, the client
browser will contact one of the other ghosts on the list.

[0065] Step 5: The browser then makes a request for an object named
a123.ghosting.akamai.com/.../www.provider.com/TECH/images/space.story.gif from the close-by
ghost. Note that the name of the original server (www.provider.com) preferably is included as part of the
URL. The software running on the ghost parses the page name into the original host name and the
real page name. If a copy of the file is already stored on the ghost, then the data is returned immediately. If,
however, no copy of the data on the ghost exists, a copy is retrieved from the original server or another
ghost server. Note that the ghost knows who the original server was because the name was encoded into
the URL that was passed to the ghost from the browser. Once a copy has been retrieved it is returned to the
user, and preferably it is also stored on the ghost for answering future requests.
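
The request handling in Step 5 can be sketched as follows. The sketch assumes, for simplicity, that the request path consists only of the original host name followed by the real page name, with no intervening fields; the class `Ghost`, the function `split_ghost_path`, the `fetch` callback, and the file name used in the usage note are hypothetical.

```python
from typing import Callable, Dict, Tuple


def split_ghost_path(path: str) -> Tuple[str, str]:
    """Split '/<original host>/<real page name>' into (host, page path)."""
    origin, sep, rest = path.lstrip("/").partition("/")
    if not origin or not sep:
        raise ValueError(f"no original host name encoded in {path!r}")
    return origin, "/" + rest


class Ghost:
    def __init__(self, fetch: Callable[[str, str], bytes]):
        self._fetch = fetch  # hypothetical callback: (host, path) -> object bytes
        self._cache: Dict[Tuple[str, str], bytes] = {}

    def serve(self, path: str) -> bytes:
        key = split_ghost_path(path)
        if key not in self._cache:
            # Miss: retrieve from the original server and keep a copy.
            self._cache[key] = self._fetch(*key)
        return self._cache[key]
```

For example, a request for "/www.provider.com/TECH/images/logo.gif" would be fetched from www.provider.com once and answered from the stored copy thereafter.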

[0066] As an additional safeguard, it may be preferable to check that the user is indeed close to
the server. This can be done by examining the IP address of the client before responding to the request
for the file. This is useful in the rare case when the client's DNS server is far away from the
client. In such a case, the ghost server can redirect the user to a closer server (or to another virtual address
that is likely to be resolved to a server that is closer to the client). If the redirect is to a virtual server,
then it must be tagged to prevent further redirections from taking place. In the preferred
embodiment, redirection would only be done for large objects; thus, a check may be made before applying
a redirection to be sure that the object being requested exceeds a certain overall size.

[0067] Performance for long downloads can also be improved by dynamically changing 
the server to which a client is connected based on changing network conditions. This is 
especially helpful for audio and video downloads (where the connections can be long and where 
quality is especially important). In such cases, the user can be directed to an alternate server in
mid-stream. The control structure for redirecting the client can be similar to that described above, but it can
also include software that is placed in the client's browser or media player. The software monitors the
performance of the client's connection and perhaps the status of the network as well. If it is
deemed that the client's connection can be improved by changing the server, then the system directs the 
client to a new server for the rest of the connection. 

[0068] The inventive global hosting scheme is a way for global ISPs or conglomerates of
regional ISPs to leverage their network infrastructure to generate hosting revenue, and to save on
network bandwidth. An ISP offering the inventive global hosting scheme can give content providers the
ability to distribute content to their users from the closest point on the ISP's network, thus ensuring
fast and reliable access. Guaranteed web site performance is critical for any web-based business, and
global hosting allows for the creation of a service that satisfies this need.

[0069] Global hosting according to the present invention also allows an ISP to control 
how and where content traverses its network. Global hosting servers can be set up at the edges of 
the ISP's network (at the many network exchange and access points, for example). This enables 
the ISP to serve content for sites that it hosts directly into the network exchange points and access 
points. Expensive backbone links no longer have to carry redundant traffic from the content
provider's site to the network exchange and access points. Instead, the content is served directly 
out of the ISP's network, freeing valuable network resources for other traffic. 

[0070] Although global hosting reduces network traffic, it is also a method by which 
global ISPs may capture a piece of the rapidly expanding hosting market, which is currently 
estimated at over a billion dollars a year. 

[0071] The global hosting solution also provides numerous advantages to Content 
Providers, and, in particular, an efficient and cost-effective solution to improve the performance of their 
Web sites both domestically and internationally. The inventive hosting software provides Content
Providers with fast and reliable Internet access by providing a means to distribute content to their
subscribers from the closest point on an ISP's network. In addition to other benefits described in
more detail below, the global hosting solution also provides the important benefit of reducing 
network traffic. 

[0072] Once inexpensive global hosting servers are installed at the periphery of an ISP's 
network (i.e., at the many network exchange and access points), content is served directly into 
network exchange and access points. As a result of this efficient distribution of content directly 
from an ISP's network, the present invention substantially improves Web site performance. In 
contrast to current content distribution systems, the inventive global hosting solution does not 
require expensive backbone links to carry redundant traffic from the Content Provider's Web site 
to the network exchange and access points. 

[0073] Figure 6 is a diagram illustrating a preferred embodiment of the content delivery
service in which the present invention may be implemented. As described above, the content
delivery service comprises a (potentially) global network 100 of content delivery servers 102a-n, a dynamic
domain name service (DNS) system 104, and a launcher tool 106 that allows content to be tagged for
inclusion on the network. Generally, the content delivery system allows the network of content delivery
servers to serve a large number of clients efficiently. Although not meant to be limiting, a typical
content server is a Pentium-based caching appliance running the Linux operating system with about 1
GB RAM and between about 40-80 GB of disk storage. As also seen in Figure 6, the content delivery service
may include a network operations center 112 for monitoring the network to ensure that key processes
are running, systems have not exceeded capacity, and that regions are interacting properly. A content
provider also has access to a monitoring suite 114 that includes tools for both realtime and historic
analysis of customer data. One tool is a traffic analyzer 116 that provides multiple
monitoring views that enable quick access to network and customer-specific traffic information. A reporter
118 allows for viewing of historical data.

[0074] As has been described, high-performance Internet content delivery is provided by
directing requests for media-rich web objects to the content delivery service network. In particular,
content is first tagged for delivery by the launcher tool 106, which is executed by a content provider at the
content provider's web site 108. The launcher 106 converts URLs to modified resource locators,
called ARLs for convenience. Figure 7 illustrates how the web site 208 operates after given
embedded objects in a web page have been modified with ARLs. As illustrated, the content provider's web
servers 200 (prior to the present invention) serve the basic or "base" HTML page 202. However, the
URLs of the embedded objects within that page have been modified (into ARLs) and no longer
resolve to the content provider's site, but rather to the content delivery service network 205.

[0075] As indicated in Figure 7A, the dynamic DNS system 204 resolves these ARLs to
optimal network servers 202 rather than to the original web servers 200. Specifically, the DNS
system 204 ensures that each request for an ARL is directed to the content server, e.g., server
202a, that will most quickly service the request and that is likely to support the requested object.
The DNS system 204 preferably comprises a set of one or more high level DNS servers 212 that
identify a particular region within the content delivery service network 205 to which a given
ARL request (e.g., a9.g.akamaitech.net) should be directed. The DNS system 204 also includes a
set of one or more low level DNS servers 214 within each such region. Low level DNS is used to
identify the particular server 202a that should be the target of the given ARL request. High level
DNS servers match clients' local name servers (e.g., local name server 216) with the low level
DNS servers that can answer their queries most quickly, thus providing clients with fast access to
up-to-date server mappings. To that end, mapping agents 207 provide each high level DNS
server 212 with a high level map, which is generated every few minutes, that optimally maps IP
blocks to sets of low level DNS servers. Each low level DNS server is assigned to direct requests to
one region of content servers, usually the region in which the low level DNS server itself resides. In
each IP block-to-{set of low level DNS servers} mapping in the high level map, all of the low level
DNS servers in the set are assigned to the same region, so it is, in effect, a mapping of IP blocks to
server regions. A fast map, generated every few seconds, assigns a server region to each low
level DNS server. The content delivery service includes appropriate control routines to create the
request-to-server mappings that are based on up-to-the-second information on current Internet traffic conditions
(derived from the mapping agents). These mappings enable the service to route end-user requests
around network problem areas and deliver content to users in the fastest, most efficient way
possible.

[0076] As described above, the present invention uses one or more network maps to facilitate
resolution of web browser requests. To this end, monitoring agents deployed throughout
the content delivery service's network preferably continuously measure the performance of the Internet.
They preferably combine a number of different performance metrics to determine the cost of
communication between regions of end-user machines (IP blocks) and servers in the network. The
performance metrics used include one or more of the following: network link latency, packet loss
rates at routers, available link capacity, network hop distance, AS path information, and test page
download times. All of the measurements are aggregated via a fat tree network overlay and
transferred to mapping agents. In a fat tree, the capacity of links and nodes increases in higher levels of
the tree, with the root node having the largest capacity. The mapping agents input the computed
communication costs into a large-scale global optimization algorithm that constructs the maps used
by the high level and low level DNS servers. In one illustrative embodiment, the performance metrics
include geo-routing data, BGP data, and ICMP (ping) data, which data is then integrated to facilitate the
map generation process. A more detailed description of a preferred map generation technique is
set forth below.

[0077] The job of the high level DNS servers is to match clients' local name servers with the
content delivery service's low level DNS servers that can answer the queries most quickly, thus providing
clients with fast access to up-to-date server mappings. To that end, the mapping agents provide each high level
DNS server with a table (a high-level map) that optimally maps IP blocks to sets of low level DNS
servers. Upon receiving a query from a local name server, a high level DNS server returns a list of
low level DNS servers that the high-level map associates with the IP address of the local name
server making the query. This table, or high-level map, is preferably recomputed based on
new traffic measurements and redistributed to high level DNS servers every 5-10 minutes;
therefore, to keep local name servers up-to-date with changing traffic conditions, the TTL on high level
DNS responses is preferably 20 minutes.
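
The table lookup performed by a high level DNS server might look like the following sketch. The class `HighLevelMap`, the server names in the usage note, and the choice of the most specific matching IP block are assumptions; the text states only that the map associates IP blocks with sets of low level DNS servers.

```python
import ipaddress


class HighLevelMap:
    """Maps the IP address of a local name server to a list of low level DNS servers."""

    def __init__(self, table):
        # table: {"IP block in CIDR notation": [low level DNS server names]}
        self._entries = sorted(
            ((ipaddress.ip_network(block), list(servers))
             for block, servers in table.items()),
            key=lambda entry: entry[0].prefixlen,
            reverse=True,  # assumption: the most specific block wins
        )

    def lookup(self, resolver_ip):
        address = ipaddress.ip_address(resolver_ip)
        for network, servers in self._entries:
            if address in network:
                return servers
        return []  # no block covers this address
```

A map built from {"18.0.0.0/8": ["lldns-1", "lldns-2"], "18.98.0.0/16": ["lldns-3"]} would return ["lldns-3"] for a query arriving from 18.98.0.17.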

[0078] To compute this table (called the high-level map), the mapping agents set up a
bipartite graph of the following form: approximately 180,000 nodes on the left, each representing the
current load coming from an IP block, and approximately 10,000 nodes on the right, each representing a single
exit link at a data center (e.g., 100 Mbps link from AboveNet San Jose - Level 3). Figure 8
illustrates the bipartite graph for HLDNS map generation. The graph becomes fully connected,
meaning there is a link between every (IP block, exit link) pair. Each link is assigned a cost based on the
collected network measurements that reflects the quality of communication between the given IP block
and exit link location. A multi-commodity bipartite min-cost flow algorithm is run on this graph,
resulting in an optimal mapping of IP blocks to exit links (i.e., total cost across all links is minimized).
An IP block-to-{set of LLDNS servers} mapping is then derived using a statically defined table that
associates a (usually) nearby LLDNS server with each exit link. This final mapping is stored in
a table and distributed to all high level DNS servers.
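
The optimization step can be illustrated on a toy instance. The sketch below is a plain single-commodity min-cost flow solved by successive shortest paths, substituted here for the multi-commodity algorithm named in the text; the function name, argument layout, and the restriction to integer loads and capacities are assumptions.

```python
from collections import deque


def min_cost_assignment(loads, capacities, cost):
    """Route each IP block's load to exit links at minimum total cost.

    loads: {block: integer load}; capacities: {link: integer capacity};
    cost: {(block, link): cost per unit of load}. Returns {(block, link): flow}.
    """
    blocks, links = list(loads), list(capacities)
    src, dst = 0, 1 + len(blocks) + len(links)
    b_node = {b: 1 + i for i, b in enumerate(blocks)}
    l_node = {l: 1 + len(blocks) + j for j, l in enumerate(links)}
    link_of = {n: l for l, n in l_node.items()}
    graph = [[] for _ in range(dst + 1)]  # edge: [to, residual, cost, reverse index]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for b in blocks:
        add_edge(src, b_node[b], loads[b], 0)
        for l in links:  # fully connected bipartite graph
            add_edge(b_node[b], l_node[l], loads[b], cost[b, l])
    for l in links:
        add_edge(l_node[l], dst, capacities[l], 0)

    remaining = sum(loads.values())
    while remaining > 0:
        # Queue-based Bellman-Ford: cheapest path in the residual graph.
        dist = [float("inf")] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0
        queue, queued = deque([src]), {src}
        while queue:
            u = queue.popleft()
            queued.discard(u)
            for i, (v, cap, c, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + c < dist[v]:
                    dist[v], prev[v] = dist[u] + c, (u, i)
                    if v not in queued:
                        queue.append(v)
                        queued.add(v)
        if prev[dst] is None:
            raise ValueError("total load exceeds total exit link capacity")
        push, v = remaining, dst
        while v != src:  # bottleneck along the path
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = dst
        while v != src:  # push flow and update residuals
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        remaining -= push

    flows = {}
    for b in blocks:
        for v, residual, _, _ in graph[b_node[b]]:
            if v in link_of and loads[b] - residual > 0:
                flows[b, link_of[v]] = loads[b] - residual
    return flows
```

On a graph of the size described in the text a specialized solver would be needed; the sketch only shows how loads, capacities, and link costs enter the problem.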

[0079] The table used by high level DNS servers can take up to 10 minutes to compute
because it takes into account each of around 180,000 IP blocks separately. To map requests to
servers optimally, based on up-to-the-second traffic conditions, the low level DNS servers may be
provided with a second, fast map, which is updated every 10 seconds. They use the fast map along with a
sophisticated load-balancing algorithm to direct each request to the optimal ghost server.

[0080] Each low level DNS server is assigned to direct requests to one region of content
servers, usually the region in which the low level DNS server itself resides. In each IP block-to-{set of
LLDNS servers} mapping in the high level map, all of the LLDNS servers in the set are assigned to the
same region, so it is, in effect, a mapping of IP blocks to server regions. Preferably, every few seconds
(e.g., 10), a fast map is computed to refine the decision made by the high level map. The fast map
assigns a server region to each LLDNS server. To achieve this fast response time, the scale of the
matching problem is reduced by grouping together IP blocks that were matched to the same set of
LLDNS servers in the high-level map. A second bipartite graph is set up, as shown in Figure 9, with each
of around 10,000 nodes on the left representing the aggregate load from an IP block group. Each
of approximately 100 nodes on the right represents the capacity of a particular server region. As
before, the graph becomes fully connected with cost links representing the cost of communication
between each IP block group and server region. Running a bipartite min-cost flow on this much
smaller graph takes only a few seconds, producing a single mapping from each IP block group to
a region. The LLDNS servers associated with each IP block group (in the high-level map) are
then assigned to serve the corresponding server region. In this way, the content delivery service
ensures that client requests are sent to the server region that will service them most quickly.
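
The grouping step that shrinks the fast map problem can be sketched in a few lines. The function name and the dictionary layouts are assumptions made for illustration.

```python
from collections import defaultdict


def group_ip_blocks(high_level_map, block_load):
    """Aggregate load over IP blocks that share the same set of LLDNS servers.

    high_level_map: {IP block: iterable of LLDNS server names}
    block_load: {IP block: current load}
    Returns {frozenset of LLDNS servers: aggregate load}.
    """
    groups = defaultdict(float)
    for block, servers in high_level_map.items():
        groups[frozenset(servers)] += block_load.get(block, 0.0)
    return dict(groups)
```

Each key of the result corresponds to one left-hand node of the second bipartite graph, and its value is the aggregate load that node represents.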

[0081] The following is a detailed description of a preferred technique for generating network
maps for use in the content delivery service.

[0082] The Map directs IP addresses to data centers. The Map is installed on the top-
level name server and is updated as network conditions change. Five types of information make 
up a Map: 

1. GEO Geographic information gleaned from the three main registration databases:
RIPE, ARIN, and APNIC.

2. BGP Information obtained from BGP (Border Gateway Protocol) routing tables.
Machines in the field act as silent peers collecting and aggregating routing information.

3. AS Autonomous System numbers of the various IP blocks, also obtained by
parsing the routing tables.

4. VIP Information obtained from Keynote, a web performance measuring service.
This information is used to map certain parts of the Internet to specific data centers with a view to
improving the Keynote measurements.

5. Traceroute Traceroutes are run and the information is collected and processed.

[0083] Geographic information is relatively static, but the BGP, AS, and VIP
information are dynamically assembled. All the information is collectively used to generate a
candidate list of data centers for each IP block. Once the candidate list is generated, a min-cost
flow algorithm selects one data center for each IP block.

[0084] In order to map IP addresses to geographic locations, a geographic map based on 
information from a number of sources is created. The main source of information was the 
WHOIS database provided by ARIN, APNIC, and RIPE. 

[0085] Internet routers use the Border Gateway Protocol (BGP) to build their routing
tables. BGP peers (routers running BGP) exchange route advertisements, which are then
examined and installed in the tables. A route for a particular CIDR (Classless Inter-domain
Routing) block, e.g., 128.2.0.0/16, contains information including the next hop (an IP address)
on the path to the CIDR block, and the list of autonomous systems (AS) that the route passes
through on the way to the block. For more information on BGP, see Internet Routing
Architectures, written by Bassam Halabi (Indianapolis: Cisco Press, 1997).

[0086] A map generation algorithm is based primarily on BGP data. BGP peers are 
established at data centers having servers so that the routes that data would take to clients if 
served from each center can be observed. Generally, the client is mapped to a center whose 
routing table indicates that the client is "nearby". The most important criterion in deciding
whether a client is nearby is the number of AS hops on the path (number of different AS's
encountered). The location of the next hop is also considered. If the next hop is located at a 
different hub or data center on the same backbone, then the client is served directly from that 
data center. 
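
The AS hop criterion can be sketched as follows. The function name, the dictionary layout, the alphabetical tie-break, and the AS numbers in the usage note (taken from the range reserved for documentation) are assumptions; the next hop check described above is not modeled.

```python
def nearest_data_center(as_paths):
    """Pick the data center whose route to the client crosses the fewest ASes.

    as_paths: {data center: list of AS numbers on the route to the client's
    CIDR block}. A repeated AS number is counted once, since the text counts
    the number of different AS's encountered.
    """
    def hops(center):
        return len(set(as_paths[center]))

    # Sorting first makes ties resolve alphabetically (an arbitrary choice).
    return min(sorted(as_paths), key=hops)
```

Given paths {"dc-east": [64496, 64497, 64498], "dc-west": [64496, 64496, 64499]}, the sketch selects "dc-west", whose path contains two distinct AS numbers.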

[0087] BGP peers participate in internal BGP (iBGP) sessions with routers at the
different data centers. The alternative to iBGP is external BGP (eBGP). iBGP is preferred
because the iBGP protocol specifies that next hop information pass unchanged from one BGP
peer to another. As a result, the BGP peer receives and stores a copy of the router's next hop.
Essentially the peer stores a mirror image of the router's entire routing table. In eBGP,
however, the router advertising a route replaces the next hop in that route with its own IP
address. The simplest way to configure the iBGP session is for the peer to be a
route-reflector-client, and for the router to act as a route reflector. A "collector" BGP peer collects the routing
tables from all of the other BGP peers. This machine establishes an eBGP session with each of
the other BGP peers that send their routing tables to this machine. This machine is not involved
in any iBGP peering sessions with routers.

[0088] The AS info identifies the Autonomous System to which a given CIDR block belongs.

[0089] A number of CIDR blocks have VIP assignments. There are five types of VIP
assignments: geographic, autonomous system, CIDR block, Keynote, and Keynote fix. All of
these are entered manually except Keynote assignments, which are generated automatically using
Keynote performance logs. Keynote fix covers any Keynote assignments that can override
existing Keynote assignments. VIP assignments are processed in the following order:

geographic 

autonomous system 

CIDR block 

Keynote 

Keynote fix 

[0090] In general, each of these five steps overrules the previous step. For example, if 
the geographic step maps a particular CIDR block to Exodus Virginia, but the autonomous 
system step maps the same block or a subset of the block to BBN Cambridge, then that block (or
its subset) is mapped to BBN Cambridge.
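
The override order can be sketched as a lookup over tagged assignments. The function name, the tuple layout, the example address blocks, and the rule that the most specific block wins within a single step are assumptions; the text specifies only that each step overrules the previous one.

```python
import ipaddress

STEP_ORDER = ["geographic", "autonomous system", "CIDR block", "Keynote", "Keynote fix"]


def resolve_vip(address, assignments):
    """Return the data center assigned to an address, or None if unassigned.

    assignments: list of (step, CIDR block, data center) tuples.
    A later step overrules an earlier one; within one step the most
    specific block wins (an assumption, the text does not say).
    """
    addr = ipaddress.ip_address(address)
    best = None
    for step, block, center in assignments:
        network = ipaddress.ip_network(block)
        if addr in network:
            key = (STEP_ORDER.index(step), network.prefixlen)
            if best is None or key > best[0]:
                best = (key, center)
    return best[1] if best else None
```

If the geographic step maps 192.0.2.0/24 to Exodus Virginia and the autonomous system step maps the subset 192.0.2.128/25 to BBN Cambridge, addresses in the subset resolve to BBN Cambridge and the rest of the block stays with Exodus Virginia.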

[0091] The file lookup/vip_geo makes broad VIP assignments based on geographic 
locations. This file permits the mapping for all addresses classified as belonging to a particular 
continent, country, state, or province to a particular data center or choice of data centers. 
Because geographic proximity is not necessarily an indicator of network performance, 
geographic VIP assignments should be made sparingly. At present, two reasons for mapping
entire continents to UUNet are 1) that UUNet has good international connectivity, and 2) that no data
centers are outside of the U.S.

[0092] Because the assignment process will be executed periodically without manual 
intervention, an effort has been made to ensure that it succeeds in producing some assignment, 
given the input data, even if the input data imposes inconsistent requirements. In particular, it 
may not be possible to satisfy upper or lower bounds on the bandwidth of traffic served from 
each data center, due to insufficient traffic from the CIDR blocks listing that data center among 
their candidates. In this case the assignment process does its best. 

[0093] The script creates a minimum-cost-flow-problem input instance that is fed into 
the CS-2 program. The resulting graph contains four layers. The first two layers form a large 
bipartite graph. 

[0094] The left side of this bipartite graph contains a node for each CIDR block. Each of 
these nodes is a source with a positive flow supply, equal to the amount of flow predicted for the 
corresponding block in the tmp-tally file (whose creation is described in another document). The 
units of flow are in bits/second. If the prediction for a CIDR block is 0, it is increased to 1, 
ensuring that the block will be mapped somewhere. 

[0095] The right side of the graph contains a node for each data center. These nodes are 
neither sources nor sinks for flow, as their supplies are 0. 

[0096] Between a CIDR block on the left and every candidate data center for that block 
on the right is an edge (specified in tmp-candidates). Each edge has zero cost and a capacity upper bound 
equal to the supply of the CIDR block. The lower bound on the capacity of 
each of these edges is zero. Thus, all of the flow from a CIDR block could be carried along any 
one of the edges out of the block. 

[0097] The CS-2 program requires that a flow be feasible. If the flow is not feasible, 
CS-2 does not return the largest flow that it can find, but instead returns only an error message. 
In designing the problem instance, capacities are set to ensure that some feasible solution exists. 
So far, sources of flow have been defined, but no sinks. 

[0098] The graph contains a small set of auxiliary nodes that do not belong to either the 
left or right side of the bipartite graph. The rightmost node is a sink. This is the only node in the 
fourth layer, and also the only node with a flow demand. The flow demand is equal to the sum 
of the flow supplies of all the CIDR blocks. 

[0099] A set of intermediate nodes sits between the right-hand side of the bipartite 
graph and the sink. Each of these nodes has in-degree one and out-degree one. The number of 
intermediate nodes is determined by the file lookup/assignment_grades. This file specifies a 
cost function that is used in calculating how expensive it will be to use the bandwidth available 
at a data center. Each line in this file is an integer. If there are k lines in the file, then there will 
be k edges from each data center on the second layer of the graph to distinct nodes on the third 
layer. The cost per unit of flow on the i-th edge is set to be the cost indicated on the i-th line of 
the file. The lower bound on the capacity of each of these edges is 0. The upper bound on the 
capacity of each edge is the upper bound on the bandwidth for the data center divided by k, with 
one exception: the k-th edge has an upper bound on capacity equal to the sum of the supplies of 
all of the CIDR blocks. Typically the costs listed in lookup/assignment_grades increase with line 
number. The idea is that using the first 1/k of the capacity of a data center should be 
inexpensive, using the next 1/k should be a little more expensive, and so on; the last 1/k, and 
any overflow beyond the capacity of the data center, should be most expensive. Note that the last 
edge alone makes it possible, but costly, to route all of the flow through a single data center. 

[00100] Each node v in the third layer is connected by a single edge to the sink on the 
fourth layer. Each of these edges has a capacity lower bound of zero and a cost of zero. The 
capacity upper bound is equal to the capacity upper bound of the edge from the second layer to v. 
Hence, any flow that makes it to the third layer can continue on to the fourth layer without 
additional cost. 
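
The four-layer construction described above can be sketched as follows. This is a minimal illustration, not the script referred to in the text: the function name build_graph, the tuple-based node labels, the argument layout, and the use of integer division for the per-edge capacity are all assumptions made for the example. The sketch only builds the node supplies and edge list; it does not invoke CS-2.

```python
def build_graph(predicted, candidates, upper_bound, grades):
    """Return (supply, edges) for the four-layer graph.

    predicted:   {cidr: predicted flow in bits/second}
    candidates:  {cidr: [candidate data centers]}
    upper_bound: {data center: bandwidth upper bound}
    grades:      list of k per-unit costs, one per line of the grades file
    Each edge is a tuple (tail, head, lower, upper, cost).
    """
    k = len(grades)
    # Layer 1: a zero prediction is raised to 1 so every block gets mapped.
    supply = {("cidr", c): max(flow, 1) for c, flow in predicted.items()}
    total = sum(supply.values())
    edges = []
    # Layer 1 -> layer 2: zero-cost edges to each candidate data center.
    for c, dcs in candidates.items():
        for dc in dcs:
            edges.append((("cidr", c), ("dc", dc), 0, supply[("cidr", c)], 0))
    for dc, bound in upper_bound.items():
        supply[("dc", dc)] = 0  # data centers are neither sources nor sinks
        for i, cost in enumerate(grades):
            # The last edge can carry everything; the others carry bound / k
            # (integer division is an assumption of this sketch).
            cap = total if i == k - 1 else bound // k
            mid = ("grade", dc, i)
            supply[mid] = 0
            edges.append((("dc", dc), mid, 0, cap, cost))  # layer 2 -> layer 3
            edges.append((mid, ("sink",), 0, cap, 0))      # layer 3 -> layer 4
    supply[("sink",)] = -total  # the only node with a flow demand
    return supply, edges


supply, edges = build_graph(
    predicted={"b0": 0, "b1": 50},
    candidates={"b0": ["A", "B"], "b1": ["A"]},
    upper_bound={"A": 40, "B": 40},
    grades=[1, 2, 4, 8],
)
print(len(edges), sum(supply.values()))
```

Because the last graded edge out of each data center has capacity equal to the total supply, every unit of flow can reach the sink through any candidate data center, which is what keeps the instance feasible.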

[00101] The lower bounds on flow indicated in the lookup/assignment_datacenters file are 
not enforced by placing lower bounds on the capacities of edges, because enforcing the bounds 
in that way might make the flow problem infeasible, in which case the CS-2 program 
would fail and return an error. The script bin/assignment buildgraph is intended to always 
generate a graph for which a feasible flow exists. 

[00102] To encourage data centers to meet their lower bounds, the cost for the edges 
corresponding to flow below the lower bounds is set to 0. If a data center, for example, has a 
lower bound that is 30% of its upper bound, and k is 10, then the first three of the k edges out of 
that data center will have their costs set to 0, regardless of the costs specified for these edges in the 
lookup/assignment_grades file. 
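
The zero-cost rule can be illustrated with a short Python sketch. The function name edge_costs and the exact boundary rule (an edge is free only when its entire slice of capacity lies at or below the lower bound) are assumptions; the text gives only the 30% and k = 10 example.

```python
def edge_costs(grades, lower_bound, upper_bound):
    """Return the per-unit cost of each of the k graded edges for one data center."""
    k = len(grades)
    costs = []
    for i, cost in enumerate(grades):
        # Edge i covers flow up to (i + 1) * upper_bound / k. It is free when that
        # whole slice lies at or below the lower bound (integer test avoids floats).
        covered_by_lower_bound = (i + 1) * upper_bound <= lower_bound * k
        costs.append(0 if covered_by_lower_bound else cost)
    return costs


# Lower bound is 30% of the upper bound and k is 10, as in the example above.
print(edge_costs(list(range(1, 11)), lower_bound=30, upper_bound=100))
```

With these inputs the first three edges cost 0 and the remaining seven keep the costs listed in the grades file.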

[00103] The CS-2 program produces a feasible flow solution of minimum cost. The 
solution does not directly yield an assignment for each CIDR block, however, because the flow 
from a CIDR block to a sink may consist of multiple flow paths. It would be ideal if there were 
only one flow path for any CIDR block, but unfortunately the problem of finding a single-path 
feasible flow of minimum cost with varying flow values is difficult. If all flow values are the 
same and the graph is bipartite, the problem - called the assignment problem - can be solved in 
polynomial time. 

[00104] The last step in the assignment process is to convert this multi-path flow into a 
single-path flow, which is equivalent to an assignment. A "randomized rounding" technique is 
used. The idea is to select from one of the multiple flow paths from a source node, i.e., a CIDR 
block on the first layer, to the sink at random, in proportion to the flow that is routed along each 
path. The following example helps to illustrate this concept. If 10 units of flow are routed from 
node u to the sink, and the flow follows three paths with flow values of 2, 3, and 5, then the first 
path is selected with probability 2/10, the second with probability 3/10, and the third with probability 
5/10. It is possible to prove bounds on the effectiveness of this strategy, i.e., to prove that, with 
high probability, no data center will receive much more or much less flow than 
expected. 
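
The rounding step can be sketched in Python as follows. The function name round_to_single_paths and the input layout are hypothetical. The sketch also assumes that the flow of each CIDR block has already been summed per data center, which is sufficient because only the first edge of the chosen path determines the assignment.

```python
import random


def round_to_single_paths(flow, rng=None):
    """Pick one data center per CIDR block, with probability proportional to flow.

    flow: {cidr: {data center: flow on the edge from cidr to that data center}}
    Returns {cidr: data center}.
    """
    if rng is None:
        rng = random.Random()
    assignment = {}
    for cidr, per_dc in flow.items():
        # Every block is assumed to send at least one unit of flow somewhere,
        # so this list is never empty.
        dcs = [dc for dc, amount in per_dc.items() if amount > 0]
        weights = [per_dc[dc] for dc in dcs]
        assignment[cidr] = rng.choices(dcs, weights=weights, k=1)[0]
    return assignment


# Node u routes 10 units along three paths with flow values 2, 3, and 5.
flow = {"u": {"A": 2, "B": 3, "C": 5}, "w": {"A": 7}}
print(round_to_single_paths(flow, random.Random(0)))
```

Block "w" has a single flow path and is therefore always assigned to data center "A"; block "u" is assigned to "A", "B", or "C" with probabilities 2/10, 3/10, and 5/10.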

[00105] The problems solved are quite large, typically 150,000 CIDR blocks and 30+ 
data centers. The rounded solutions are usually very close in both the total cost and the flow 
imposed on any data center to the cost and flow of the minimum-cost multi-path solution. More 
sophisticated deterministic algorithms could be applied here. In particular, there is an algorithm 
that guarantees that no data center receives more flow than that imposed by the optimal solution 
plus the flow out of one CIDR block. For a bipartite graph, the algorithm is a degenerate case 
of an algorithm for converting multi-path flows to single-path flows. 

[00106] The correspondence between the single-path flow and an assignment is 
straightforward. The first edge on the one flow path from a source node u on the first layer (a 
CIDR block) to the sink must be an edge from u to a node v on the second layer (a data center). 
The CIDR block for u is assigned to v. For this correspondence to produce an assignment for 
every CIDR block, there must be at least a unit of flow out of each node in the first layer. This is 
why one is added to any supplies with value 0 before invoking CS-2. 

[00107] The above-described functionality of each of the components of the global hosting 
architecture preferably is implemented in software executable in a processor, namely, as a set of instructions 
or program code in a code module resident in the random access memory of the computer. Until 
required by the computer, the set of instructions may be stored in another computer memory, for example, in 
a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy 
disk (for eventual use in a floppy disk drive), or downloaded via the Internet or other computer 
network. 

[00108] In addition, although the various methods described are conveniently implemented 
in a general purpose computer selectively activated or reconfigured by software, one of ordinary skill in 
the art would also recognize that such methods may be carried out in hardware, in firmware, or in more 
specialized apparatus constructed to perform the required method steps. 

[00109] Further, as used herein, a Web "client" should be broadly construed to mean any 
computer or component thereof directly or indirectly connected or connectable in any known or later- 
developed manner to a computer network, such as the Internet. The term Web "server" should also be broadly 
construed to mean a computer, computer platform, an adjunct to a computer or platform, or any 
component thereof. Of course, a "client" should be broadly construed to mean one who requests or 
gets the file, and "server" is the entity which downloads the file. 

[00110] Having thus described our invention, what we claim as new and desire to secure by 
Letters Patent is set forth in the following claims. 